A Sentence-Trimming Approach to Multi-Document Summarization
Abstract
We implemented an initial application of a sentence-trimming approach (Trimmer) to the problem of multi-document summarization in the MSE2005 and DUC2005 tasks. Sentence trimming was incorporated into a feature-based summarization system, called MultiDocument Trimmer (MDT), by using sentence trimming as both a preprocessing stage and a feature for sentence ranking. We demonstrate that we were able to port Trimmer easily to this new problem. Although the direct impact of sentence trimming was minimal compared to other features used in the system, the interaction of the other features resulted in trimmed sentences accounting for nearly half of the selected summary sentences.
Similar resources
Sentence Trimming and Selection: Mixing and Matching
We describe how components from two distinct multi-document summarization systems were combined. Twenty-four possible combinations of components were considered. We observed some contrasts between conservative and aggressive sentence compression (i.e., trimming) in the context of multi-document summarization.
Multi-candidate reduction: Sentence compression as a tool for document summarization tasks
This article examines the application of two single-document sentence compression techniques to the problem of multi-document summarization—a “parse-and-trim” approach and a statistical noisy-channel approach. We introduce the Multi-Candidate Reduction (MCR) framework for multi-document summarization, in which many compressed candidates are generated for each source sentence. These candidates a...
Multi-Candidate Reduction for Flexible Single-Document Summarization
Sentence compression techniques based on linguistically-motivated syntactic rules have proved effective in single-document summarization tasks. The addition of topic terms yields state-of-the-art performance, according to previous evaluations. Since “trimming” rules must be applied successively, optimal rule ordering presents a challenge. This paper describes the Multi-Candidate Reduction (MCR)...
A survey on Automatic Text Summarization
Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. Sifting through such a massive amount of data to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...
A Graph-based Approach to Cross-language Multi-document Summarization
Cross-language summarization is the task of generating a summary in a language different from the language of the source documents. In this paper, we propose a graph-based approach to multi-document summarization that integrates machine translation quality scores in the sentence extraction process. We evaluate our method on a manually translated subset of the DUC 2004 evaluation campaign. Resul...